53 research outputs found
Delving into Variance Transmission and Normalization: Shift of Average Gradient Makes the Network Collapse
Normalization operations are essential for state-of-the-art neural networks
and enable us to train a network from scratch with a large learning rate (LR).
We attempt to explain the real effect of Batch Normalization (BN) from the
perspective of variance transmission by investigating the relationship between
BN and Weights Normalization (WN). In this work, we demonstrate that the
problem of the shift of the average gradient will amplify the variance of every
convolutional (conv) layer. We propose Parametric Weights Standardization
(PWS), a fast module for conv filters that is robust to the mini-batch size, to
solve the shift of the average gradient. PWS can provide the speed-up of BN
while requiring less computation, and it does not change the output of a conv
layer. PWS enables the network to converge fast without normalizing the
outputs. This result enhances the persuasiveness of the shift of the average
gradient and explains why BN works from the perspective of variance
transmission. The code and appendix will be made available at
https://github.com/lyxzzz/PWSConv.
Comment: This paper has been accepted by AAAI2
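The abstract does not give PWS's exact formula; as an illustrative sketch of the family of techniques it belongs to, the snippet below standardizes each conv filter's weights (zero mean, unit variance, then a learnable gain; the names `gain` and `eps` are our assumptions, not the paper's). Centering removes the mean component that the paper links to the shift of the average gradient.

```python
import numpy as np

def standardize_filters(w, gain=1.0, eps=1e-5):
    """Per-filter weight standardization: center and scale each conv
    filter so its weights have zero mean and unit variance, then apply
    a learnable gain. Operating on the weights (not the outputs) leaves
    the conv layer's output formula unchanged."""
    # w has shape (out_channels, in_channels, kh, kw)
    flat = w.reshape(w.shape[0], -1)
    mean = flat.mean(axis=1, keepdims=True)
    std = flat.std(axis=1, keepdims=True)
    out = (flat - mean) / (std + eps)
    return gain * out.reshape(w.shape)

rng = np.random.default_rng(0)
w = rng.normal(0.3, 1.0, size=(8, 3, 3, 3))  # filters with a mean shift
ws = standardize_filters(w)
```

Because this acts on the weights rather than the activations, it needs no batch statistics, which is why such modules are robust to the mini-batch size.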
Judicial Intelligent Assistant System: Extracting Events from Divorce Cases to Detect Disputes for the Judge
In formal procedure of civil cases, the textual materials provided by
different parties describe the development process of the cases. It is a
difficult but necessary task to extract the key information for the cases from
these textual materials and to clarify the dispute focus of related parties.
Currently, officers read the materials manually and use methods such as
keyword searching and regular-expression matching to find the target
information. These approaches are time-consuming and depend heavily on the
officers' prior knowledge and carefulness. To help officers improve their
working efficiency and accuracy, we propose an approach to detect disputes from divorce
cases based on a two-round-labeling event extracting technique in this paper.
We implement the Judicial Intelligent Assistant (JIA) system according to the
proposed approach to 1) automatically extract focus events from divorce case
materials, 2) align events by identifying co-reference among them, and 3)
detect conflicts among events brought by the plaintiff and the defendant. With
the JIA system, it is convenient for judges to determine the disputed issues.
Experimental results demonstrate that the proposed approach and system can
obtain the focus of cases and detect conflicts more effectively and efficiently
compared with the existing method.
Comment: 20 pages
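The three-step pipeline the abstract describes (extract events, align them by co-reference, detect conflicts) can be illustrated with a deliberately simplified sketch of the third step; the event schema and the rule that aligned events with disagreeing claims constitute a dispute are our assumptions, not the paper's actual model.

```python
# Toy sketch of the JIA pipeline's conflict-detection step: after events
# from the plaintiff and defendant have been extracted and aligned
# (steps 1-2), flag aligned event pairs whose claims contradict.
def detect_conflicts(aligned_pairs):
    """aligned_pairs: list of (plaintiff_event, defendant_event) dicts
    sharing the same 'topic'; a conflict is a disagreeing 'claim'."""
    conflicts = []
    for p, d in aligned_pairs:
        if p["topic"] == d["topic"] and p["claim"] != d["claim"]:
            conflicts.append(p["topic"])
    return conflicts

pairs = [
    ({"topic": "child_custody", "claim": "plaintiff"},
     {"topic": "child_custody", "claim": "defendant"}),
    ({"topic": "separation_date", "claim": "2019-05"},
     {"topic": "separation_date", "claim": "2019-05"}),
]
```

Here only `child_custody` is reported as disputed, since both parties agree on the separation date; the real system makes this comparison over events extracted by the two-round-labeling technique.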
Impact of a Diagnostic Pressure Equation Constraint on Tornadic Supercell Thunderstorm Forecasts Initialized Using 3DVAR Radar Data Assimilation
A diagnostic pressure equation constraint has been incorporated into a storm-scale three-dimensional variational (3DVAR) data assimilation system. This diagnostic pressure equation constraint (DPEC) aims to improve dynamic consistency among different model variables so as to produce better data assimilation results and improve the subsequent forecasts. Ge et al. (2012) described the development of DPEC and tested it with idealized experiments. DPEC was also applied to a real supercell case, but only radial velocity was assimilated. In this paper, DPEC is further applied to two real tornadic supercell thunderstorm cases, where both radial velocity and radar reflectivity data are assimilated. The impact of DPEC on radar data assimilation is examined mainly based on the storm forecasts. It is found that the experiments using DPEC generally predict higher low-level vertical vorticity than the experiments not using DPEC near the time of observed tornadoes. Therefore, it is concluded that the use of DPEC improves the forecast of mesocyclone rotation within supercell thunderstorms. The experiments using different weighting coefficients generate similar results. This suggests that DPEC is not very sensitive to the weighting coefficients.
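In weak-constraint variational assimilation, a diagnostic relation like DPEC typically enters the 3DVAR cost function as an additional weighted penalty term. The sketch below shows that generic structure only; the quadratic forms, the toy constraint, and the weight `w_c` are illustrative assumptions, not the actual DPEC formulation.

```python
import numpy as np

def cost_3dvar(x, xb, B_inv, y, H, R_inv, c_residual, w_c):
    """Generic 3DVAR cost with a weak constraint term:
    J(x) = (x - xb)^T B^-1 (x - xb)       background term
         + (H(x) - y)^T R^-1 (H(x) - y)   observation term
         + w_c * ||c_residual(x)||^2      constraint penalty (e.g. DPEC)
    A larger weight w_c pulls the analysis toward satisfying the
    diagnostic constraint, i.e. toward dynamic consistency."""
    db = x - xb
    do = H(x) - y
    r = c_residual(x)
    return db @ B_inv @ db + do @ R_inv @ do + w_c * (r @ r)

# Tiny 2-variable example: identity observation operator and a single
# scalar constraint residual that happens to be satisfied (sum(x) == 3).
x = np.array([1.0, 2.0])
J = cost_3dvar(
    x, np.zeros(2), np.eye(2),
    np.array([1.0, 1.0]), lambda v: v, np.eye(2),
    lambda v: np.array([v.sum() - 3.0]), w_c=10.0)
```

The paper's finding that results are insensitive to the weighting coefficients corresponds to varying `w_c` in this penalty term.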
Security and Energy-aware Collaborative Task Offloading in D2D communication
The device-to-device (D2D) communication technique establishes direct links among mobile devices (MDs) to reduce communication delay and increase network capacity over the underlying wireless networks. Existing D2D schemes for task offloading focus on system throughput, energy consumption, and delay without considering data security. This paper proposes a Security and Energy-aware Collaborative Task Offloading scheme for D2D communication (Sec2D). Specifically, we first build a novel security model, in terms of the number of CPU cores, CPU frequency, and data size, for measuring the security workload on heterogeneous MDs. Then, we formulate the collaborative task offloading problem that minimizes the time-average delay and energy consumption of MDs while ensuring data security. To meet this goal, the Lyapunov optimization framework is applied to implement online decision-making. Two solutions, a greedy approach and an optimal approach with different time complexities, are proposed to deal with the resulting mixed-integer linear programming (MILP) problem. The theoretical proofs demonstrate that Sec2D follows a [O(1∕V),O(V)] energy-delay tradeoff. Simulation results show that Sec2D can guarantee both data security and system stability in the collaborative D2D communication environment.
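The [O(1/V), O(V)] tradeoff the abstract cites is the standard drift-plus-penalty result of Lyapunov optimization. The toy below shows the generic per-slot decision rule only; the candidate actions and their delay/energy numbers are made-up illustrations, not Sec2D's actual model.

```python
def drift_plus_penalty_step(Q, actions, V, energy_budget):
    """One slot of Lyapunov drift-plus-penalty: pick the action that
    minimizes V * delay + Q * energy, then update the virtual queue Q,
    which tracks cumulative energy spent above the per-slot budget.
    A larger V favors low delay at the cost of a larger queue: the
    [O(1/V), O(V)] energy-delay tradeoff."""
    name, delay, energy = min(actions, key=lambda a: V * a[1] + Q * a[2])
    Q_next = max(Q + energy - energy_budget, 0.0)
    return name, Q_next

# Hypothetical actions: (name, delay cost, energy cost).
actions = [("local", 4.0, 1.0), ("offload", 1.0, 3.0)]
```

With a small queue the energy-hungry, low-delay choice wins; once the queue grows from exceeding the budget, the rule switches to the low-energy choice, which is how online decision-making stabilizes energy without solving the full MILP each slot.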
Domain Adaptive Code Completion via Language Models and Decoupled Domain Databases
Large Language Models (LLMs) have demonstrated remarkable performance in code
completion. However, due to the lack of domain-specific knowledge, they may
be suboptimal at completing code that requires intensive domain knowledge, for
example completing library names. Although several works have confirmed the
effectiveness of fine-tuning techniques for adapting language models to code
completion in specific domains, they are limited by the need to constantly
fine-tune the model as the project iterates.
To address this limitation, in this paper we propose NM-LM, a
retrieval-augmented language model (R-LM) that integrates domain knowledge
into language models without fine-tuning. Unlike previous techniques, our
approach automatically adapts to different language models and domains.
Specifically, it utilizes in-domain code to build a retrieval-based database
decoupled from the LM, and then combines the two through Bayesian inference to
complete the code. Extensive experiments on intra-project and intra-scenario
completion confirm that NM-LM brings appreciable improvements over CodeGPT and
UnixCoder. A deep analysis of our tool, covering response speed, storage
usage, completion of specific code types, and API invocation completion,
confirms that NM-LM delivers satisfactory performance, making it well suited
for domain-adaptive code completion. Furthermore, our approach does not
require direct access to the language model's parameters, so it can seamlessly
integrate with black-box code completion models as a plugin that further
enhances their performance.
Comment: Accepted by ASE202
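The abstract does not spell out the Bayesian combination; a common way retrieval-augmented LMs mix the two next-token distributions is kNN-LM-style interpolation, sketched below. The interpolation weight `lam` and the toy vocabulary are our assumptions, not NM-LM's actual formulation.

```python
import numpy as np

def combine_lm_retrieval(p_lm, p_ret, lam=0.3):
    """Mix the base LM's next-token distribution with one derived from
    a retrieval database: p = lam * p_ret + (1 - lam) * p_lm.
    Because the database is decoupled from the LM, it can be rebuilt
    for a new domain or project without any fine-tuning."""
    p = lam * np.asarray(p_ret) + (1.0 - lam) * np.asarray(p_lm)
    return p / p.sum()  # renormalize against rounding drift

# Toy vocab: ["requests", "numpy", "os"]. The in-domain database pushes
# probability toward the library this project actually uses.
p_lm = [0.5, 0.3, 0.2]
p_ret = [0.1, 0.8, 0.1]
p = combine_lm_retrieval(p_lm, p_ret)
```

Only the output distribution of the base model is needed here, which matches the abstract's point that the approach works with black-box completion models.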
Learning the Relation between Similarity Loss and Clustering Loss in Self-Supervised Learning
Self-supervised learning enables networks to learn discriminative features
from massive data itself. Most state-of-the-art methods maximize the similarity
between two augmentations of one image based on contrastive learning.
Exploiting the consistency between the two augmentations removes the burden of
manual annotation. Contrastive learning uses instance-level
information to learn robust features. However, the learned information is
probably confined to different views of the same instance. In this paper, we
attempt to leverage the similarity between two distinct images to boost
representation in self-supervised learning. In contrast to instance-level
information, the similarity between two distinct images may provide more useful
information. Besides, we analyze the relation between similarity loss and
feature-level cross-entropy loss. These two losses are essential for most deep
learning methods. However, the relation between these two losses is not clear.
Similarity loss helps obtain instance-level representation, while feature-level
cross-entropy loss helps mine the similarity between two distinct images. We
provide theoretical analyses and experiments to show that a suitable
combination of these two losses can achieve state-of-the-art results. Code is
available at https://github.com/guijiejie/ICCL.
Comment: This paper is accepted by IEEE Transactions on Image Processing
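The two losses the abstract relates can be written in a few lines: a cosine-similarity loss between embeddings of two augmented views (instance-level) and a feature-level cross-entropy against cluster assignments, which lets similar but distinct images share a target. The combination weight `alpha` is illustrative, not the paper's tuned value.

```python
import numpy as np

def similarity_loss(z1, z2):
    """Negative cosine similarity between embeddings of two
    augmentations of the same image (instance-level term)."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    return -np.mean(np.sum(z1 * z2, axis=1))

def feature_cross_entropy(logits, assignments):
    """Feature-level cross-entropy against (soft) cluster assignments;
    distinct images assigned to the same cluster pull together."""
    logp = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.sum(assignments * logp, axis=1))

def combined_loss(z1, z2, logits, assignments, alpha=0.5):
    # Weighted combination of the two terms; alpha here is illustrative.
    return (alpha * similarity_loss(z1, z2)
            + (1 - alpha) * feature_cross_entropy(logits, assignments))
```

Identical views give the minimum similarity loss of -1, while the cross-entropy term is minimized when the logits concentrate on the assigned cluster; the paper's analysis concerns how these two objectives interact when traded off.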
LawBench: Benchmarking Legal Knowledge of Large Language Models
Large language models (LLMs) have demonstrated strong capabilities in various
aspects. However, when applying them to the highly specialized, safety-critical
legal domain, it is unclear how much legal knowledge they possess and whether
they can reliably perform legal-related tasks. To address this gap, we propose
a comprehensive evaluation benchmark, LawBench. LawBench has been meticulously
crafted to precisely assess the LLMs' legal capabilities from three
cognitive levels: (1) Legal knowledge memorization: whether LLMs can memorize
needed legal concepts, articles and facts; (2) Legal knowledge understanding:
whether LLMs can comprehend entities, events and relationships within legal
text; (3) Legal knowledge applying: whether LLMs can properly utilize their
legal knowledge and make necessary reasoning steps to solve realistic legal
tasks. LawBench contains 20 diverse tasks covering 5 task types: single-label
classification (SLC), multi-label classification (MLC), regression, extraction
and generation. We perform extensive evaluations of 51 LLMs on LawBench,
including 20 multilingual LLMs, 22 Chinese-oriented LLMs, and 9 legal-specific
LLMs. The results show that GPT-4 remains the best-performing LLM in the legal
domain, surpassing the others by a significant margin. While fine-tuning LLMs
on legal-specific text brings certain improvements, we are still a long way
from obtaining usable and reliable LLMs for legal tasks. All data, model
predictions, and evaluation code are released at
https://github.com/open-compass/LawBench/. We hope this benchmark provides an
in-depth understanding of the LLMs' domain-specific capabilities and speeds up
the development of LLMs in the legal domain.